77 research outputs found

    i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets

    Background: Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Results: Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. Conclusions: In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. Availability: A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/

    HabiSign: a novel approach for comparison of metagenomes and rapid identification of habitat-specific sequences

    Background: One of the primary goals of comparative metagenomic projects is to study the differences in the microbial communities residing in diverse environments. Besides providing valuable insights into the inherent structure of the microbial populations, these studies have potential applications in several important areas of medical research like disease diagnostics, detection of pathogenic contamination and identification of hitherto unknown pathogens. Here we present a novel and rapid, alignment-free method called HabiSign, which utilizes patterns of tetra-nucleotide usage in microbial genomes to bring out the differences in the composition of both diverse and related microbial communities. Results: Validation results show that the metagenomic signatures obtained using the HabiSign method are able to accurately cluster metagenomes at biome, phenotypic and species levels, as compared to an average tetranucleotide frequency based approach and the recently published dinucleotide relative abundance based approach. More importantly, the method is able to identify subsets of sequences that are specific to a particular habitat. Apart from this, being alignment-free, the method can rapidly compare and group multiple metagenomic data sets in a short span of time. Conclusions: The proposed method is expected to have immense applicability in diverse areas of metagenomic research ranging from disease diagnostics and pathogen detection to bio-prospecting. A web-server for the HabiSign algorithm is available at http://metagenomics.atc.tcs.com/HabiSign/.
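    The abstract above refers to patterns of tetranucleotide usage without giving details. As a rough illustration of the general idea only (not the HabiSign algorithm itself), the sketch below computes a 256-dimensional tetranucleotide frequency vector for a sequence and compares two such vectors. The function names and the choice of Euclidean distance are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the unambiguous DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return the relative frequency of each tetranucleotide in `sequence`.

    Windows containing ambiguous bases (e.g. N) are ignored.
    Hypothetical helper for illustration; not part of HabiSign.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def euclidean_distance(a, b):
    """Distance between two frequency vectors (an assumed, simple metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    v1 = tetranucleotide_frequencies("ACGTACGTACGTTTGACA")
    v2 = tetranucleotide_frequencies("GGGGCCCCGGGGCCCCAT")
    print(euclidean_distance(v1, v2))
```

    Because no alignment is involved, the cost of building each vector is linear in sequence length, which is the property that makes composition-based comparisons fast on large data sets.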

    Metagenome of the gut of a malnourished child

    Background: Malnutrition, a major health problem, affects a significant proportion of preschool children in developing countries. The devastating consequences of malnutrition include diarrhoea, malabsorption, increased intestinal permeability, suboptimal immune response, etc. Nutritional interventions and dietary solutions have not been effective for treatment of malnutrition till date. Metagenomic procedures allow one to access the complex cross-talk between the gut and its microbial flora and understand how a different community composition affects various states of human health. In this study, a metagenomic approach was employed for analysing the differences between gut microbial communities obtained from a malnourished and an apparently healthy child. Results: Our results indicate that the malnourished child gut has an abundance of enteric pathogens which are known to cause intestinal inflammation resulting in malabsorption of nutrients. We also identified a few functional sub-systems from these pathogens, which probably impact the overall metabolic capabilities of the malnourished child gut. Conclusion: The present study comprehensively characterizes the microbial community resident in the gut of a malnourished child. This study has attempted to extend the understanding of the basis of malnutrition beyond nutrition deprivation.

    iVikodak—A Platform and Standard Workflow for Inferring, Analyzing, Comparing, and Visualizing the Functional Potential of Microbial Communities

    Background: The objectives of any metagenomic study typically include identification of resident microbes and their relative proportions (taxonomic analysis), profiling functional diversity (functional analysis), and comparing the identified microbes and functions with available metadata (comparative metagenomics). Given the advantage of cost-effectiveness and convenient data-size, amplicon-based sequencing has remained the technology of choice for exploring phylogenetic diversity of an environment. A recent school of thought, employing the existing genome annotation information for inferring functional capacity of an identified microbiome community, has given a promising alternative to Whole Genome Shotgun sequencing for functional analysis. Although a handful of tools are currently available for function inference, their scope, functionality and utility have essentially remained limited. A need is therefore felt for a comprehensive framework that expands upon the existing scope and enables a standardized workflow for function inference, analysis, and visualization. Methods: We present iVikodak, a multi-modular web-platform that hosts a logically inter-connected repertoire of functional inference and analysis tools, coupled with a comprehensive visualization interface. iVikodak is equipped with published algorithms driven by microbial co-inhabitance patterns, along with multiple updated databases of various curated microbe-function maps. It also features an advanced task management and result sharing system through introduction of personalized and portable dashboards. Results: In addition to inferring functions from 16S rRNA gene data, iVikodak enables (a) an in-depth analysis of specific functions of interest, (b) identification of microbes contributing to various functions, (c) analysis of microbial interaction patterns through function-driven correlation networks, and (d) simultaneous functional comparison between multiple microbial communities. We have benchmarked iVikodak through multiple case studies and comparisons with the existing state of the art. We also introduce the concept of a public repository which provides a first of its kind community-driven framework for scientific data analytics, collaboration and sharing in this area of microbiome research. Conclusion: Developed using modern design and task management practices, iVikodak provides a multi-modular, yet inter-operable, one-stop framework that intends to simplify the entire approach toward inferred function analysis. It is anticipated to serve as a significant value addition to the existing space of functional metagenomics. The iVikodak web-server may be freely accessed at https://web.rniapps.net/iVikodak/

    A cross-sectional study on the nasopharyngeal microbiota of individuals with SARS-CoV-2 infection across three COVID-19 waves in India

    Background: Multiple variants of the SARS-CoV-2 virus have plagued the world through successive waves of infection over the past three years. Independent research groups across geographies have shown that the microbiome composition in COVID-19 positive patients (CP) differs from that of COVID-19 negative individuals (CN). However, these observations were based on limited-sized sample-sets collected primarily from the early days of the pandemic. Here, we study the nasopharyngeal microbiota in COVID-19 patients, wherein the samples have been collected across the three COVID-19 waves witnessed in India, which were driven by different variants of concern. Methods: The nasopharyngeal swabs were collected from 589 subjects providing samples for diagnostic purposes at the Centre for Cellular and Molecular Biology (CSIR-CCMB), Hyderabad, India and subjected to 16S rRNA gene amplicon-based sequencing. Findings: We found variations in the microbiota of symptomatic vs. asymptomatic COVID-19 patients. CP showed a marked shift in the microbial diversity and composition compared to CN, in a wave-dependent manner. Rickettsiaceae was the only family that was noted to be consistently depleted in CP samples across the waves. The genera Staphylococcus, Anhydrobacter, Thermus, and Aerococcus were observed to be highly abundant in the symptomatic CP patients when compared to the asymptomatic group. In general, we observed a decrease in the burden of opportunistic pathogens in the host microbiota during the later waves of infection. Interpretation: To our knowledge, this is the first analytical cross-sectional study of this scale, which was designed to understand the relation between the evolving nature of the virus and the changes in the human nasopharyngeal microbiota. Although no clear signatures were observed, this study shall pave the way for a better understanding of the disease pathophysiology and help gather preliminary evidence on whether interventions to the host microbiota can help in better protection or faster recovery.

    Gene prediction with Glimmer for metagenomic sequences augmented by classification and clustering

    Environmental shotgun sequencing (or metagenomics) is widely used to survey the communities of microbial organisms that live in many diverse ecosystems, such as the human body. Finding the protein-coding genes within the sequences is an important step for assessing the functional capacity of a metagenome. In this work, we developed a metagenomics gene prediction system Glimmer-MG that achieves significantly greater accuracy than previous systems via novel approaches to a number of important prediction subtasks. First, we introduce the use of phylogenetic classifications of the sequences to model parameterization. We also cluster the sequences, grouping together those that likely originated from the same organism. Analogous to iterative schemes that are useful for whole genomes, we retrain our models within each cluster on the initial gene predictions before making final predictions. Finally, we model both insertion/deletion and substitution sequencing errors using a different approach than previous software, allowing Glimmer-MG to change coding frame or pass through stop codons by predicting an error. In a comparison among multiple gene finding methods, Glimmer-MG makes the most sensitive and precise predictions on simulated and real metagenomes for all read lengths and error rates tested

    Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

    Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or their variants, and often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
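    The abstract above describes maximum likelihood estimation under a mixture model that accounts for ambiguous read assignments and genome size bias. A minimal sketch of that general idea, assuming a plain expectation-maximization loop and a simple length correction, might look like the following. It is not the GRAMMy implementation; the function name, the input layout (`read_probs[i][j]` as the probability of read i under genome j) and the fixed iteration count are assumptions made for this example.

```python
def em_relative_abundance(read_probs, genome_lengths, iterations=200):
    """Estimate genome relative abundance from ambiguous read assignments.

    read_probs[i][j]: assumed probability of read i given genome j.
    genome_lengths[j]: length of genome j, used to correct size bias.
    Hypothetical sketch of an EM mixture-model fit, for illustration only.
    """
    n_genomes = len(genome_lengths)
    # Start from uniform mixing proportions (fraction of reads per genome).
    mixing = [1.0 / n_genomes] * n_genomes

    for _ in range(iterations):
        expected_counts = [0.0] * n_genomes
        # E-step: split each read across genomes by posterior responsibility.
        for probs in read_probs:
            weighted = [mixing[j] * probs[j] for j in range(n_genomes)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read matches no genome; skip it
            for j in range(n_genomes):
                expected_counts[j] += weighted[j] / norm
        assigned = sum(expected_counts)
        if assigned == 0.0:
            break
        # M-step: mixing proportions are the normalized expected counts.
        mixing = [c / assigned for c in expected_counts]

    # Longer genomes yield more reads per cell, so divide by genome length
    # and renormalize to get relative abundance of genomes, not of reads.
    per_length = [mixing[j] / genome_lengths[j] for j in range(n_genomes)]
    total = sum(per_length)
    return [x / total for x in per_length]


if __name__ == "__main__":
    reads = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(em_relative_abundance(reads, genome_lengths=[2_000_000, 1_000_000]))
```

    The length correction in the last step is what separates a read-share estimate from a genome-abundance estimate: two genomes receiving equal numbers of reads are not equally abundant if one is twice as long.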